Skip to content

[BEAM-4783] Fix issues created in #6181.#7690

Closed
kyle-winkelman wants to merge 1 commit intoapache:masterfrom
kyle-winkelman:beam-4783
Closed

[BEAM-4783] Fix issues created in #6181.#7690
kyle-winkelman wants to merge 1 commit intoapache:masterfrom
kyle-winkelman:beam-4783

Conversation

@kyle-winkelman
Copy link
Contributor

@kyle-winkelman kyle-winkelman commented Jan 31, 2019

Fix issues created in #6181 and incorrectly fixed in #6884 (although this PR increased readability greatly).

Before any work on BEAM-4783, GroupCombineFunctions would always use a new HashPartitioner(rdd.rdd().sparkContext().defaultParallelism());.

The intent was to skip the creation of this Partitioner and call groupByKey() with no arguments only when the new bundleSize option was in use. #6181 actually did the opposite causing performance impacts, and because #6181 had terrible readability #6884 did not fix it correctly.

I am also hopeful that this can be cherry picked into 2.10 if another RC is created to get the Spark Runner's performance back to the levels seen in 2.7.

@iemejia @timrobertson100


Follow this checklist to help us incorporate your contribution quickly and easily:

  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

It will help us expedite review of your Pull Request if you tag someone (e.g. @username) to look at it.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- --- --- --- ---
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Python Build Status --- Build Status
Build Status
Build Status --- --- ---

@iemejia
Copy link
Member

iemejia commented Feb 1, 2019

Thanks @kyle-winkelman sorry for not responding before to your messages. I was aware of the issue, just too busy with other stuff to take a look again. I will do and ping you when done.

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM sorry for the delay, and thanks for bringing back this fix @kyle-winkelman.
I am not sure there is going to be a new RC but if there is I will try to include this one. Anyway next version (2.11.0) is going to be cut next week so it will be included there for sure.

@iemejia
Copy link
Member

iemejia commented Feb 8, 2019

Merged manually to correct the commit title and add an extra @Nullable annotation.

@iemejia iemejia closed this Feb 8, 2019
@kyle-winkelman kyle-winkelman deleted the beam-4783 branch February 8, 2019 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants